POSEIDON goes to Bali

Ernesto

February 25, 2017

Introduction

  • Why bother with Agent-Based Models?
  • How POSEIDON works
  • How can we use POSEIDON in Indonesia?

Agent Based Models

Simulation

  • (some) Knowledge about individuals
  • Hard to Aggregate
  • Simulate

Traffic - 1

Traffic - 2

Why use a model at all?

  • Prediction
  • Control
  • Hypothesis Testing

Evacuation - 1

Evacuation - 2

Evacuation - 3

Evacuation - 4

Evacuation - 5

Smallpox 1

Smallpox 2

Smallpox 3

  • Targeted Vaccines
  • School / Work Closures

Misconceptions

  • ABMs need lots and lots of data
    • 50 out of 113 land-use ABMs used no data
    • 30 out of 82 used only qualitative data
  • ABMs are complicated
  • ABMs should be complicated

Where ABMs work best

  • Adaptation
  • Interdependence
  • Heterogeneity
  • Limited Knowledge

Why a fishery ABM?

  • Adaptation
    • to incentives
    • to regulations
  • Interdependence
    • direct (fisher)
    • indirect (biological)
  • Heterogeneity
    • strategies
    • legality
    • distributional effects
  • Limited Knowledge
    • fishers’
    • policy-makers’

POSEIDON

What do we want

  • Policy Simulator
    • Agent Flexibility
    • Model Flexibility
  • Final Objective: to connect indicators with actions

What do we want from this meeting

  • Show
    • Abstract model runs
    • Flexibility
  • Ask
    • What would you do with it?
    • Given a year of development time, what would you spend it on?

Why do we want it?

  • Failures of the Commons and Gordon, 1954
  • Effort control and capital stuffing
  • Gear regulations and efficiency drops
  • Area closures and fishing the lines
  • Quotas and their allocation

How do others do it?

  • Random Utility Models
    • Statistically Efficient
    • Easily Generalizable
    • Policy-Brittle
  • Dynamic Programming
    • Strongly Rational
    • Computationally Expensive
    • Ad hoc

The One Agent Problem

one-armed bandit

The One Agent Problem

  • Find the most profitable spot to fish
  • Constraints:
    • No biomass information
    • No model knowledge
    • Environment changes over time
  • Subproblems:
    • How to explore
    • Explore-Exploit Tradeoff

Explore or Exploit?

Explore-Exploit

  • Stochastically choose to explore next trip with probability \(p\)
  • Explore in the neighborhood of where you currently go
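The two bullets above fit in a few lines of Python. This is a minimal sketch, not POSEIDON's actual implementation: the grid size, `p_explore`, and `delta` values are illustrative, and the 1-D grid stands in for the model's 2-D map.

```python
import random

def next_spot(current, p_explore=0.2, delta=3, grid_size=50):
    """With probability p_explore, shift the fishing spot by up to
    +/- delta cells; otherwise keep exploiting the current spot."""
    if random.random() < p_explore:
        shift = random.randint(-delta, delta)
        # clamp the explored spot to the grid
        return min(max(current + shift, 0), grid_size - 1)
    return current
```

Each trip either repeats the last spot or perturbs it locally, which is all the agent needs when it has no biomass map and no model of the environment.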

One Agent world

One Agent sample run

Scaling issues

L’enfer, c’est les autres (Hell is other people)

  • Other boats consume biomass
  • You can use other boats’ information
  • How to imitate?
  • With probability \(p\) explore, otherwise copy somebody who’s doing better

Explore-Exploit-Imitate

# with probability epsilon, explore
if (runif(1) < epsilon) {
  # shock your position by up to +/- delta
  position <- position + runif(1, min = -delta, max = delta)
} else if (profits[me] < profits[best_friend]) {
  # if a friend is doing better, play their slot machine
  position <- positions[best_friend]
}
# otherwise play the previous slot machine
play(position)

Two Agents sample run

Many Agents

Cui prodest? (Who benefits?)

  • Model free
  • Adaptive

Oil Prices

Fish the Line (part 1)

Fish the Line (part 2)

A flexible simulator

  • Flexible in terms of:
    • Decisions
    • Biology
    • Algorithms

Target Switching

Gear Selection

OSMOSE

West Coast

WFS

Gravitational Search - Demo

Kernel Regression

Kernel Regression - Demo

Policies

Simulating Policies

  • Open Loop
    • Scenario Evaluation
    • Policy Optimization
  • Closed Loop
    • Policy Search
    • Policy Discovery

Open Loop

Scenario Evaluation

  1. You have adaptive agents
  2. Somebody hands you a set of policies to test
  3. Apply each in turn
  4. Check which performs best
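The four steps above are just a loop over candidate policies. A schematic Python version, where `run_model` is a hypothetical stand-in for a full simulation run that returns a scalar score (e.g. 20-year profits):

```python
def evaluate_scenarios(policies, run_model):
    """Run the model once per candidate policy and rank the results.

    policies: dict mapping policy name -> policy object
    run_model: stand-in for one full simulation run returning a score
    """
    scores = {name: run_model(policy) for name, policy in policies.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

Because the agents adapt inside `run_model`, each policy is scored against fishers who react to it, not against a fixed behavioral assumption.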

TAC vs ITQ (mileage)

TAC vs ITQ (catchability)

Seventy-Thirty World

ITQ Prices

  • Quotas are distributed 90% red, 10% blue

Blues are choke species

ITQ drives Gear (start)

ITQ drives Gear (end)

Gear change fixes waste

North-South world

Location choices

ITQ incentivizes geography

Layered Policy - setup

Layered Policy - Anarchy

Layered Policy - MPA

Layered Policy - MPA + ITQ

Policy Optimization

  1. You have adaptive agents
  2. Somebody hands you a family of policies
  3. You want to find the “best” parameters
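Searching a policy family's parameters can be sketched the same way; here as a brute-force grid search (a hypothetical sketch — a real study would use far fewer model runs or a smarter optimizer):

```python
def optimize_policy(param_grid, run_model):
    """Return the parameter setting with the highest simulated score.

    param_grid: iterable of candidate parameter settings
    run_model: stand-in for one simulation run returning a score
    """
    best_params, best_score = None, float("-inf")
    for params in param_grid:
        score = run_model(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```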

Optimal MPA

  • Geographically split world
  • Find the single MPA that maximizes a score

Optimal MPA

\[ \text{Score} = \text{Blue Biomass}_{t=20} \]

Optimal MPA

\[ \text{Score} = \text{Blue Biomass}_{t=20} + \sum_{i=1}^{20} \text{Red Landings}_{t=i}\]

Optimal MPA - Well-mixed

Perfect Enforcement

No Enforcement

Decent Enforcement

  • 15% hourly probability of being caught
  • $1,000 fine

Why do people cheat?

Optimal level of enforcement

  • You want to maximize 20 years profits
  • You can impose an MPA

Free Enforcement

  • MPA is free and enforcement is perfect

Expensive Enforcement

  • Enforcement costs: \[ 10\text{M} \cdot p + 10{,}000 \cdot \text{MPA area} \]

Optimal Quotas

\[ \text{Score} = \text{Blue Biomass}_{t=20} + \sum_{i=1}^{20} \text{Red Landings}_{t=i}\]

  • Geographically split map
  • 300 fishers
  • Very different quota values for TAC and ITQ

Optimal TAC

Optimal ITQ

Well-mixed world?

In a scenario where fishers are unable to respond to incentives, the optimal quotas under TACs and ITQs are exactly the same

Pareto Front

Heterogeneous fleets

  • 2 kinds of boat:
    • Small boats
    • Large boats
  • 2 Objectives:
    • Maximize small boat income
    • Maximize efficiency
  • 1 Policy lever:
    • Build MPA

Fairness Front

Right-most solution

Left-most solution

Closed Loop

Bluemania

  • Well mixed world
  • Want to incentivize gear change through a landing tax
  • Blue fish worth 3 times red fish

No intervention

PID Taxation

  • Expensive (blue) stock gets consumed too rapidly
  • Geographically separated
  • Update the tax smoothly so that only about 600 units of blue stock are landed each day
  • Poor man’s quotas
  • Use a PI controller \[ p_{t+1} = a e_t + b \sum_{i=0}^{t} e_{i} \] \[ e_t = \text{Landings}_t - 600 \]
  • “Autopilot” policy
  • Parameters matter
  • Noise matters
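The controller formula above translates directly into code. A minimal sketch: the gains `a` and `b` are illustrative (the slides note that parameters matter), only the 600-unit target comes from the scenario.

```python
def make_pi_controller(a=0.01, b=0.001, target=600.0):
    """PI controller for the landing tax:
    p_{t+1} = a*e_t + b*sum(e_0..e_t), with e_t = landings - target."""
    integral = 0.0
    def update(landings):
        nonlocal integral
        error = landings - target
        integral += error          # accumulate the I term
        return a * error + b * integral
    return update
```

Called once per day with observed blue landings, it raises the tax when landings overshoot 600 and lowers it when they undershoot — a poor man's quota on autopilot.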

PID Taxation - demo

PID Taxation - optimal

Policy Discovery

  1. You have adaptive agents
  2. You have state (possibly degraded) indicators and action levers
  3. You have to figure out how to link the two together
  4. You want the decision rule to be optimal

Policy Discovery

Biomass-based control

  • 1 species of fish
  • 300 Fishers

Random Controller

Bayesian Controller - Quota

Reinforcement Learning

  • Can’t set quotas
  • Can only open/close fishery each month
  • Biomass and time of year are our only indicators
  • Train it for 1,000 episodes, \(\gamma = .999\)

20 years

80 years

Comparisons

Method                        20 Years    80 Years
Quota - optimized 20 years     412,056     390,581
Biomass controller             352,566   1,058,428
Random controller              398,069     390,678
Anarchy                        230,225     202,231

Revenue-based controller

  • Perfect biomass monitoring is impossible
  • Can we create a controller looking only at average profits and distance from port (human dimensions)?
  • Train it for 2,000 episodes, \(\gamma = .999\)

80 years

Comparisons

Method                        20 Years    80 Years
Quota - optimized 20 years     412,056     390,581
Biomass controller             352,566   1,058,428
Random controller              398,069     390,678
Anarchy                        230,225     202,231
Cash-distance controller       326,116   1,001,269

Problems

  • Rough around the edges
  • Often does not converge
  • Opaque result, hard to describe \[ Q(I,a) = \alpha + \sum_i \beta_i \cos(c_i \pi I) \]